This Rady School of Management Masters of Business Analytics team has taken on Rady Children’s Hospital (RCH) as a client in order to improve the anticipated demand model for its Emergency Department (ED). Estimating demand is particularly important for this part of the hospital system because bed and staff resources are limited, and there are costs or losses from having too many or too few staff on the floor at any given time. To assist in the decision-making process for allocating the proper number of staff months in advance, our team has performed data exploration and analysis on individual patients and staff scheduling.
In our analysis, we have found that the best predictors of patient volume tend to be day of the week, week number of the year, and month. Adding external data such as historical CDC flu data for San Diego County or weather data did not change results significantly. A second important finding concerns the bottleneck of operations during busy times as revealed by the data. After filtering the data to show only times during which all available hospital beds were occupied, we sought to verify whether beds or physicians tended to be the rate-limiting factor for maximal patient throughput. From this patient flow analysis, we have found that there is an ideal number of physicians to staff during peak hours. There is evidence that too few physicians cause a bottleneck in patient throughput, while excessively high numbers reach a plateau and do not increase patient throughput further. In this project paper, we discuss the motivation for conducting analysis on these topics, elaborate on techniques and methodology, provide evidence in a thorough analysis section, and finally give actionable recommendations for the decision makers of the RCH Emergency Department.
Rady Children’s Hospital San Diego (RCHSD) is the largest children’s hospital in California and is recognized by US News & World Report as one of the best children’s hospitals in the country. The problem RCH faces is that it serves a very large number of patients every day, and patient volumes in the Emergency Department in particular are increasing. Therefore, a decompression plan to maintain patient throughput, reduce patient waiting time, and retain patient satisfaction is urgently needed.
For example, during flu season, RCH needs to keep the left-without-being-seen rate low, or even at zero, by using the decompression plan. However, many factors other than flu season may influence patient volumes, so volumes surge at different points each year. It is therefore important to determine the right time to expand Emergency Department capacity so that costs are minimized and service delivery is not negatively affected.
We also reviewed comments on Google and Yelp and found that the keyword people mention most frequently is “waiting-time”. Reviewers usually said that the service is good but that they had to wait a long time. These comments and reviews illustrate the consequences of this problem for RCH: broadly speaking, RCH risks losing patient volume and lowering patient satisfaction.
The next step is to develop predictive models to determine when and which overflow units should be opened, and to decide planned staffing levels for the Emergency Department, based on the data we have on hand, such as patient volume trends, flu prevalence data, and other external data.
By deploying predictive models, the decision makers (the directors of the Emergency Department) could set a much more efficient staffing schedule, for example how many doctors and nurses are needed on a given day four months out, based on the predicted patient volume for that day, since four months is the window the ED must schedule ahead. This would in turn help reduce patients’ waiting time and keep the left-without-being-seen rate low.
To make the plan easier and more convenient for decision makers to implement, we built a Shiny app in which they can update the dataset at any time to generate new predictions. Using the predicted hourly patient volume for a given day and the patient distribution pattern within a day, the directors of the Emergency Department can decide the staff schedule for that day.
The following is an itemized list of relevant features requested on a per-individual basis:
Timestamps - Registration, bed/room assignment, physician visit, and discharge time are important for considering the operations flow to identify possible bottlenecks within the ED during busy times.
Chief Complaint - Provides the ability to view the most popular reasons for coming to the ED, as well as the extent of seasonality for such reasons. For example, chief complaints of a fever fluctuate with flu season. Fever is the dominant chief complaint in this data set.
Diagnosis - May differ from chief complaint which the patient or parent of the patient may perceive as the underlying issue requiring treatment. Provides better ability to allocate resources depending on what treatments are called for most frequently.
Gender - Investigate any correlation between gender and different chief complaints, diagnoses, length of stay, or behavioral patient levels.
Acuity - The urgency of each case is important for considering the average wait time for these different patient segments, as well as for staffing different acuities properly.
Behavioral Patient - This patient segment is important to identify due to the substantially larger time required to treat and supervise mental health patients. This variable is a factor of either yes or no.
Physicians Attending - All records of physician visits are recorded and time stamped. This factor is important to include for the sake of investigating bottlenecks to patient throughput.
San Diego Weather - Local weather can have a big influence on the overarching health of the surrounding population, whether it be from consistently low temperatures causing flu-like illness or large fluctuations in temperature making it difficult for immune systems to adjust quickly. One important consideration is that this is the data for the city of San Diego only. Further analysis would be bolstered by bringing in additional cities and matching them to different patient segments according to their zip codes since weather may vary from city to city.
Wind Speed - High wind speeds can lead to a more extreme effective low temperature. This can be problematic especially during colder periods when children are already susceptible to cold, fever, and flu in the winter time.
Rainfall - Rainy days may deter individuals from taking a trip to the ER if less severe medical complications arise. Rain may also increase the incidence of sickness as days grow colder and wetter, with less sunshine for essential vitamin D.
Centers for Disease Control and Prevention (CDC) Flu Instances - Historical records kept by the CDC give a good relative measure of the severity of flu seasons over time, as well as the variation in volume reported through this agency.
San Diego County School District Holidays and Academic Milestones - School is a large part of the lives of our target customer segment, children, and has a great influence on their physical and mental well-being.
School Schedule - The start of school can be a difficult time for students to adjust and could be linked to a mental shock from entering a new environment away from home. May be correlated with an increase in stress which could be linked to an increase in Mental Health patients throughout the school year.
Holidays - Holidays suggest that parents may not have to go to work and will have more available time to take their child to an ER for more urgent conditions. This could also possibly correlate with an increase in physical activities or emotional distress caused by children being at family gatherings.
There are multiple constraints within the RCH ED environment which are important to take into consideration when modeling and preparing to provide business recommendations. Some of these factors are fixed until further notice and should not be factors which we suggest to change in our Recommendations write up. These factors include limitations on staff scheduling, available beds and zones, and maintaining certain performance standards.
If we are to make recommendations for more efficiently staffing the ED, it is important to note a few items which are not so flexible as far as policy goes in order to prevent any suggestions which are not practical to implement. The first notable item is the schedule making time frame. Due to the unionization of nurse staffing, initial schedules are created 4 months ahead of schedule so as to allocate the correct number of staff and call in a reasonable number of travel nurses with plenty of notice. Changes to this scheduling time frame are typically made, at latest, 2 months ahead of time in order to compensate for new information which may lead to more effective schedules to match staffing assignment to anticipated patient demand. For these reasons, we will focus our interface on displaying predictions for 4 months and 2 months out, while allowing the option to look at other times outside of this time frame with minimal manual selection.
Along the same lines of staffing restrictions, there are minimum and maximum staff which are typically kept in the ED. Although there must be a small handful of physicians present in the ED at any given time (even in the late hours of the night to be prepared for emergency situations), the maximum number of physicians is somewhat more flexible. In our process flow analysis, we break down the number of physicians during peak hours to gain some insight into the optimal number of physicians to maintain maximum patient throughput with minimal staff. This allows for a maximum throughput without wasting any resources which would be more wisely allocated at a different time. Finally, there is a set rule for staffing ratios of physicians to nurses, which is typically 1:3. This gives a relative scheduling guideline for allocating nurses given that the ED is making an accurate decision about how many physicians should be required during any given hour. Minimum nursing staff is typically 13 nurses during Summer mornings and nights, while maximum nursing staff tends to be about 33 during the busy nights of Winter season.
The number of beds available fluctuates with how busy the ED is expected to be, as well as with time of the day and week. Within the ED, certain zones can be shut down if there is a low utilization rate for beds since there are operational costs to keeping the lights on for these areas. As a peripheral part of the ED, there is a Cardiology unit (Cardio) which is adopted for over a dozen additional beds depending on the time of day. Better known as the ED South for our stakeholders, this Cardio unit is at their disposal during weekends and also when the Cardio department no longer needs these beds, typically relinquishing the space to the ED by 5PM each weekday depending on how busy the Cardio unit has been that day. It is not a reasonable suggestion at this time to recommend the ED add more beds to increase their capacity during busy times. It is a much more practical suggestion to recommend that a particular zone of the ED open an hour earlier or close an hour later than is typically scheduled. Without the ED South unit open, the maximum number of beds to be utilized is ~43. When this Cardio zone has its beds at the disposal of the ED, the maximum number of beds rises to ~60.
Rady Children’s Hospital is a mission and value driven business with a focus on improving patient outcomes. Staff universally strive to provide the highest quality care to patients and to maintain a high level of patient satisfaction while minimizing (or eliminating) any LWBS patients (those who have Left Without Being Seen). It is presumed that excessively long wait times are a primary cause of potential LWBS cases, and this is one of the motivators for this project. A second performance standard is to minimize the costs of operating the ED. One of the largest costs in the day to day operation of the ED is staff salaries, so it is imperative to be mindful of this when optimizing staffing numbers. There are costs to both understaffing and overstaffing in the ED. Understaffing leads to longer wait times and higher LWBS rates. In addition to causing dissatisfied customers (or a lack of customers), there is also lost revenue from patients who elect to travel elsewhere for treatment or decide not to get treatment from an ED at all. Overstaffing, on the other hand, has quickly diminishing returns, since the utilization of staff drops as the number of excess physicians increases. It is therefore of great importance to have an accurate gauge of staffing requirements well ahead of time in order to avoid these costs and the resulting detriment to the bottom line of the ED as a business operation.
It is important to understand the nature of the existing model in order to construct improved iterations which still take into account all significant factors for prediction. The current patient volume demand model used by decision makers at the RCH ED consists of a historical average for a particular time grouping. These groupings combine a Low, Medium, or High season, depending on the relative average patient volume for that month, with a day-of-week group, either Sunday-Monday or Tuesday-Saturday. Given the historical patient volumes for these time frames, RCH has a policy of staffing to the 75% level (the 75th percentile) of the historical demand incurred from patient volume. The 75% level suggests that, according to the accumulated data collected, the ED will not have a shortage of staff 75% of the time. This implies that there would be an inherent shortage of staff in the event of a new maximum volume or a scenario in which a phenomenon such as flu season starts earlier than expected. When such a scenario arises, it typically takes at minimum two weeks for staff managers to adjust schedules accordingly and give nurses ample time to legally change their schedules given union restrictions. Our goal in improving the current model is to better gauge patient volume demand so that staffing managers can better allocate resources initially at the 4 month mark, and can check again at the 2 month mark to make more fine tuned adjustments in cases where the model gives a slightly different and more accurate prediction for the nearer time period. With the help of different stakeholders and engineers within RCH, the team was able to re-create this model and understand what factors go into it, as well as the important aspects the model provides to make staffing decisions less strenuous.
Rady Children’s Hospital has an existing model to staff the physicians. The model considers 4 major factors:
Season: High, Medium, Low Demand Season (Based on their understanding of average daily patient volume)
Weekday: Sun/Mon, Tue-Sat (Assuming Sunday and Monday have more patients than that of the other days of week)
Acuities: Acuities 1-2-3, Acuities 4-5 (Industry agreement on treatment throughput: 2.15 pts/hour for acuities 1-2-3, and 3.00 pts/hour for acuities 4-5)
Historical Volume: 75th percentile of the patient volumes for a specific hour given season, weekday, and acuities (Estimated by tradeoffs between costs and benefits)
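As a rough illustration of how the throughput rates listed above could turn an hourly arrivals estimate into a physician requirement, the sketch below divides arrivals in each acuity group by the corresponding rate. It assumes the rates are per physician per hour and that the two groups simply add; the function name and the example numbers are hypothetical and this is not the hospital's actual staffing formula.

```python
# Throughput rates taken from the factor list above (patients per hour).
# ASSUMPTION: the rates apply per physician, and the two acuity groups add up.
HIGH_ACUITY_RATE = 2.15  # acuities 1-2-3
LOW_ACUITY_RATE = 3.00   # acuities 4-5


def physician_need(high_acuity_arrivals, low_acuity_arrivals):
    """Return the (fractional) number of physicians implied by hourly arrivals."""
    return (high_acuity_arrivals / HIGH_ACUITY_RATE
            + low_acuity_arrivals / LOW_ACUITY_RATE)


# Hypothetical hour: 8.6 expected arrivals of acuity 1-3 and 6 of acuity 4-5.
need = physician_need(8.6, 6.0)
print(f"Physicians needed this hour: about {need:.2f}")
```

In practice the fractional result would be rounded up and then checked against the minimum and maximum staffing limits described earlier.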
Currently, the initial physician schedule is made by Dr. Carstairs. The schedule is usually released 4 months in advance and is subject to slight changes no later than 2 weeks prior, although such changes are difficult and require negotiation.
The Emergency Department of Rady Children’s Hospital uses different colors to label different zones. Each zone has its own opening and closing hours, number of beds available, and priority for treating different patients. Below is a sample scheduling table:
Suppose the scheduler is making a schedule for one particular demand season, one particular weekday group, and one particular hour of the day. The scheduler looks back at all past days with the same characteristics and calculates the historical hourly number of patient arrivals. Those volumes form a distribution. We then take the 75th percentile of that distribution and use it as the point estimate of the expected number of patient arrivals.
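The lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout `(season, weekday_group, hour, arrivals)`, the function names, and the sample rows are all hypothetical, and the percentile uses linear interpolation between order statistics, which may differ from the convention used in the hospital's own model.

```python
from collections import defaultdict


def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty list, by linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def arrivals_estimate(history, q=75):
    """Group hourly arrival counts by (season, weekday group, hour) and
    return the q-th percentile of each group as the point estimate."""
    groups = defaultdict(list)
    for season, weekday_group, hour, arrivals in history:
        groups[(season, weekday_group, hour)].append(arrivals)
    return {key: percentile(counts, q) for key, counts in groups.items()}


# Hypothetical history: one row per past day and hour.
history = [
    ("High", "Sun/Mon", 18, 10),
    ("High", "Sun/Mon", 18, 12),
    ("High", "Sun/Mon", 18, 14),
    ("High", "Sun/Mon", 18, 16),
    ("High", "Sun/Mon", 18, 18),
    ("Low", "Tue-Sat", 3, 2),
    ("Low", "Tue-Sat", 3, 4),
]
estimates = arrivals_estimate(history)
print(estimates[("High", "Sun/Mon", 18)])  # 75th percentile for that group
```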
Patients usually stay for treatment for quite a long period. To capture this, the current model also applies a “smoothing” calculation and uses the smoothed number as the final patient volume estimate. The smoothed values are calculated as follows:
The metric for measuring the supply, which is staffing need in the current model, is calculated as follows:
Patient Arrivals Estimate among \(h_0\), \(h_{-1}\), and \(h_{-2}\)
The in-sample daily RMSE is 78.71, and MAE is 72.3.
The out-of-sample daily RMSE is 50.39, and MAE is 43.32.
The in-sample hourly RMSE is 5.53, and MAE is 4.16.
The out-of-sample hourly RMSE is 6.86, and MAE is 5.3.
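For reference, the two error metrics reported above can be computed as in the sketch below. The actual and predicted volumes are made-up numbers for illustration only, not values from the RCH data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily patient volumes and model estimates.
actual = [200, 220, 180, 240]
predicted = [210, 200, 190, 230]
print(f"RMSE = {rmse(actual, predicted):.2f}, MAE = {mae(actual, predicted):.2f}")
```

Because RMSE squares each error before averaging, it penalizes occasional large misses more heavily than MAE does, which is why it is never smaller than MAE on the same data.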
The current model does a reasonable job of estimating patient volume. Given that the schedule has to be made 4 months in advance, this model helps the decision maker form a good general picture and make better strategic decisions. However, the error of this model is considerable, with an out-of-sample mean absolute error of 43.32 per day. In the following sections, we will analyze and discuss potential improvements to this model.
Patient volume fluctuates considerably. It begins to descend in February, rises slightly in July and August, and then completes its upward trend the following February. One point worth noting is that daily patient volume dips slightly in November compared with October. Patient volume begins to surge from the middle of December.
We see more variation in daily patient volumes in January, February, and December.
June, July, and August have relatively low daily patient volume.
Peaks appear in January and February.
Clearly, average daily patient volume varies across the week. However, from the patient volume history that we have, we can only conclude that patient volume on Tuesday differs from that on Friday and Saturday (at 95% confidence). The plot shows that Monday and Tuesday tend to have more patients, but the difference compared with the other days of the week is not statistically significant.
Currently, the ED groups weekdays as Tuesday through Saturday versus Sunday and Monday. The result suggests that Tuesday should be grouped with the high demand weekdays.
The differences in average patient volume across months are greater. After Bonferroni adjustment (which reduces the likelihood of incorrectly rejecting the null hypothesis), we can roughly group the months into three clusters:
The result matches the current month grouping of the model.
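To make the Bonferroni adjustment mentioned above concrete, the sketch below computes the per-comparison significance threshold when every pair of months is compared. It assumes an overall significance level of 0.05 and all pairwise comparisons among 12 months; neither the number of comparisons nor the p-values shown come from the original analysis.

```python
from itertools import combinations


def bonferroni_threshold(alpha, n_groups):
    """Return (number of pairwise comparisons, per-comparison threshold)."""
    n_comparisons = len(list(combinations(range(n_groups), 2)))
    return n_comparisons, alpha / n_comparisons


def significant_pairs(p_values, alpha, n_groups):
    """Keep only the pairs whose raw p-value clears the adjusted threshold."""
    _, threshold = bonferroni_threshold(alpha, n_groups)
    return [pair for pair, p in p_values.items() if p < threshold]


n_comparisons, threshold = bonferroni_threshold(0.05, 12)
print(n_comparisons, threshold)

# Hypothetical raw p-values for two month pairs.
p_values = {("Jan", "Jul"): 0.0001, ("Jun", "Jul"): 0.03}
print(significant_pairs(p_values, 0.05, 12))
```

A raw p-value of 0.03 would pass an unadjusted 0.05 cutoff but fails once the threshold is divided across all month pairs, which is the conservative behavior the adjustment is meant to provide.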
From the plots, we can see that the average flow time of a patient does not vary much either across the 4 year period or across the 7 different weekdays. However, flow time varies considerably between non-behavioral and behavioral patients. The flow time for non-behavioral patients averages around 200 minutes, roughly 3 hours, whereas the flow time for behavioral patients averages around 450 minutes, more than twice that of non-behavioral patients. Moreover, the variation in the distribution of flow time for behavioral patients is considerably larger.
We select the top 15 chief complaints by number of observations and calculate the average flow time for each, as shown in the chart above. The chart shows that suicidal patients take significantly longer to be discharged from the hospital, with an average flow time of 609 minutes, while patients with the chief complaint that has the second-highest average flow time take only about 255 minutes.
From the graph shown above, we observe an abnormality starting in September 2017: the average wait time increased significantly across the week. (We believe this needs to be further validated with the ED.)
Besides that,
This graph shows the average time patients wait in bed for a physician, by the hour at which the patient arrived at the emergency department. The size of each dot represents the number of observations: the larger the dot, the more observations. It is clear that during peak hours (17:00 to midnight), the average time waiting in bed for a doctor increases.
As shown above, Tuesday should belong to the high demand weekday group.
The in-sample daily RMSE is 76.99, and MAE is 70.68.
The out-of-sample daily RMSE is 48.61, and MAE is 42.1.
The in-sample hourly RMSE is 5.54, and MAE is 4.17.
The out-of-sample hourly RMSE is 6.88, and MAE is 5.32.
Compared with the current model, which groups Tuesday into the low demand weekdays, the refined model slightly improves the daily metrics (out-of-sample daily MAE of 42.1 versus 43.32), while the hourly metrics are essentially unchanged.
To calculate the number of physicians present within each hour, we group the records by physician name and date. For each physician, we find the first and last recorded times of the day, and we assume that the physician was working throughout the period between them. We then calculate the census over all physicians, which gives the number of physicians working within each hour.
This calculation assumes the following:
Physicians do not leave the hospital early in the day and return late on the same day (e.g. go home at 1am and come back to work at 11pm).
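The census calculation described above can be sketched as follows. This is an illustrative reconstruction rather than the code used in the analysis: the input layout `(physician_name, timestamp)`, the function name, and the sample events are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_census(events):
    """events: iterable of (physician_name, timestamp).
    Returns {start of hour: number of physicians assumed to be working}."""
    # First and last recorded time for each physician on each date.
    first_last = {}
    for name, ts in events:
        key = (name, ts.date())
        first, last = first_last.get(key, (ts, ts))
        first_last[key] = (min(first, ts), max(last, ts))

    # Count the physician in every hour between first and last appearance.
    census = defaultdict(int)
    for first, last in first_last.values():
        hour = first.replace(minute=0, second=0, microsecond=0)
        while hour <= last:
            census[hour] += 1
            hour += timedelta(hours=1)
    return dict(census)


# Hypothetical physician events on a single day.
events = [
    ("Dr. A", datetime(2014, 1, 2, 8, 15)),
    ("Dr. A", datetime(2014, 1, 2, 10, 40)),
    ("Dr. B", datetime(2014, 1, 2, 9, 5)),
    ("Dr. B", datetime(2014, 1, 2, 9, 50)),
]
census = hourly_census(events)
for hour in sorted(census):
    print(hour, census[hour])
```

Note that grouping by calendar date splits a shift that crosses midnight into two separate spans, so overnight hours are still counted as long as the physician has events on both dates.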
The resulting hourly census table has 35040 observations of 4 variables:
time - POSIXct timestamp for the start of each hour, e.g. "2014-01-02 00:00:00", "2014-01-02 01:00:00", ...
count - Integer count for that hour, e.g. 3, 4, 4, 5, 4, 5, 5, 6, 6, 6, ...
date - Date, e.g. "2014-01-02"
hour - Integer hour of the day, e.g. 0, 1, 2, 3, ...
We believe this is currently the best approximation we can obtain from the data we have. However, the resulting number of physicians appears higher than we expected, with a maximum of 18. The distribution is as follows:
Possible improvement: Calculate the time difference between a physician's previous removal from a patient and next assignment to another patient. This requires choosing a threshold (length of time) beyond which a physician is considered not at work.
Patient output is defined as the number of patients who encountered the event Patient departed from ED.
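The gap-based refinement suggested above could look like the sketch below, which splits one physician's patient assignments into separate working periods whenever the idle gap exceeds a threshold. The 4 hour threshold, the `(assigned_at, removed_at)` input layout, and the function name are all hypothetical; choosing the actual threshold remains an open question.

```python
from datetime import datetime, timedelta


def split_into_shifts(assignments, max_gap):
    """assignments: list of (assigned_at, removed_at) for one physician.
    A new shift starts when the gap between the latest removal so far and
    the next assignment exceeds max_gap."""
    ordered = sorted(assignments)
    if not ordered:
        return []
    shifts = []
    shift_start, shift_end = ordered[0]
    for assigned_at, removed_at in ordered[1:]:
        if assigned_at - shift_end > max_gap:
            shifts.append((shift_start, shift_end))
            shift_start, shift_end = assigned_at, removed_at
        else:
            shift_end = max(shift_end, removed_at)
    shifts.append((shift_start, shift_end))
    return shifts


# Hypothetical physician who works just after midnight and again late at night.
assignments = [
    (datetime(2014, 1, 2, 0, 5), datetime(2014, 1, 2, 0, 45)),
    (datetime(2014, 1, 2, 0, 30), datetime(2014, 1, 2, 1, 0)),
    (datetime(2014, 1, 2, 23, 0), datetime(2014, 1, 2, 23, 50)),
]
shifts = split_into_shifts(assignments, max_gap=timedelta(hours=4))
for start, end in shifts:
    print(start, "to", end)
```

With this approach the physician in the example is counted for roughly two hours instead of the full day, which addresses the case excluded by the assumption in the first-and-last-appearance method.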
We also define utilization as \[\frac{\text{number of beds used}}{\text{total beds available}}\].
The bed schedule given to us cannot be perfectly matched with our calculation: 3751 out of 34740 observations have more beds observed than scheduled.
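As a small worked example of the utilization definition and the mismatch count above, the sketch below computes utilization for a single period and the share of periods in which observed beds exceed the schedule. The function names and the short observed and scheduled series are hypothetical; only the 3751 and 34740 figures come from the text.

```python
def utilization(beds_used, beds_available):
    """Number of beds used divided by total beds available."""
    return beds_used / beds_available


def share_over_schedule(observed, scheduled):
    """Fraction of periods in which observed beds exceed scheduled beds."""
    over = sum(1 for o, s in zip(observed, scheduled) if o > s)
    return over / len(observed)


# Hypothetical periods: observed beds in use versus beds on the schedule.
observed = [40, 45, 50, 61]
scheduled = [43, 43, 60, 60]
print(f"Utilization in first period: {utilization(observed[0], scheduled[0]):.2f}")
print(f"Share over schedule: {share_over_schedule(observed, scheduled):.2f}")

# Share implied by the counts reported in the text.
reported_share = 3751 / 34740
print(f"Reported share over schedule: {reported_share:.1%}")
```

A utilization above 1 in this sketch is exactly the mismatch case, so roughly one observation in nine in the reported data shows more beds in use than the schedule allows.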
The reasons may vary, we think:
Based on the previous analysis, peak demand usually occurs from 6pm through midnight, and the ED opens as many beds as possible during those hours.